{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**复习：**数据分析的第一步，加载数据我们已经学习完毕了。当数据展现在我们面前的时候，我们所要做的第一步就是认识他，今天我们要学习的就是**了解字段含义以及初步观察数据**。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1 第一章：数据载入及初步观察"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 1.4 知道你的数据叫什么\n",
    "我们学习pandas的基础操作，那么上一节通过pandas加载之后的数据，其数据类型是什么呢？"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**开始前导入numpy和pandas**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import pandas as pd"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 1.4.1 任务一：pandas中有两个数据类型DateFrame和Series，通过查找简单了解他们。然后自己写一个关于这两个数据类型的小例子🌰[开放题]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {},
   "outputs": [],
   "source": [
    "#写入代码\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "'''\n",
    "#我们举的例子\n",
    "sdata = {'Ohio': 35000, 'Texas': 71000, 'Oregon': 16000, 'Utah': 5000}\n",
    "example_1 = pd.Series(sdata)\n",
    "example_1\n",
    "'''"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "'''\n",
    "#我们举的例子\n",
    "data = {'state': ['Ohio', 'Ohio', 'Ohio', 'Nevada', 'Nevada', 'Nevada'],\n",
    "        'year': [2000, 2001, 2002, 2001, 2002, 2003],'pop': [1.5, 1.7, 3.6, 2.4, 2.9, 3.2]}\n",
    "example_2 = pd.DataFrame(data)\n",
    "example_2\n",
    "'''\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 1.4.2 任务二：根据上节课的方法载入\"train.csv\"文件\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {},
   "outputs": [],
   "source": [
    "#写入代码\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "也可以加载上一节课保存的\"train_chinese.csv\"文件。通过翻译版train_chinese.csv熟悉了这个数据集，然后我们对trian.csv来进行操作\n",
    "#### 1.4.3 任务三：查看DataFrame数据的每列的名称"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {},
   "outputs": [],
   "source": [
    "#写入代码\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 1.4.4任务四：查看\"Cabin\"这列的所有值[有多种方法]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {},
   "outputs": [],
   "source": [
    "#写入代码\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {},
   "outputs": [],
   "source": [
    "#写入代码\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 1.4.5 任务五：加载文件\"test_1.csv\"，然后对比\"train.csv\"，看看有哪些多出的列，然后将多出的列删除\n",
    "经过我们的观察发现一个测试集test_1.csv有一列是多余的，我们需要将这个多余的列删去"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {},
   "outputs": [],
   "source": [
    "#写入代码\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {},
   "outputs": [],
   "source": [
    "#写入代码\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "【思考】还有其他的删除多余的列的方式吗？"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "# 思考回答\n",
    "\n",
    "\n",
    "\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 1.4.6 任务六： 将['PassengerId','Name','Age','Ticket']这几个列元素隐藏，只观察其他几个列元素"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {},
   "outputs": [],
   "source": [
    "#写入代码\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "【思考】对比任务五和任务六，是不是使用了不一样的方法(函数)，如果使用一样的函数如何完成上面的不同的要求呢？"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "【思考回答】\n",
    "\n",
    "如果想要完全的删除你的数据结构，使用inplace=True，因为使用inplace就将原数据覆盖了，所以这里没有用"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 1.5 筛选的逻辑"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "表格数据中，最重要的一个功能就是要具有可筛选的能力，选出我所需要的信息，丢弃无用的信息。\n",
    "\n",
    "下面我们还是用实战来学习pandas这个功能。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 1.5.1 任务一： 我们以\"Age\"为筛选条件，显示年龄在10岁以下的乘客信息。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {},
   "outputs": [],
   "source": [
    "#写入代码\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 1.5.2 任务二： 以\"Age\"为条件，将年龄在10岁以上和50岁以下的乘客信息显示出来，并将这个数据命名为midage"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {},
   "outputs": [],
   "source": [
    "#写入代码\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "【提示】了解pandas的条件筛选方式以及如何使用交集和并集操作"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 1.5.3 任务三：将midage的数据中第100行的\"Pclass\"和\"Sex\"的数据显示出来"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {},
   "outputs": [],
   "source": [
    "#写入代码\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "【提示】在抽取数据中，我们希望数据的相对顺序保持不变，用什么函数可以达到这个效果呢？"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 1.5.4 任务四：使用loc方法将midage的数据中第100，105，108行的\"Pclass\"，\"Name\"和\"Sex\"的数据显示出来"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {},
   "outputs": [],
   "source": [
    "#写入代码\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 1.5.5 任务五：使用iloc方法将midage的数据中第100，105，108行的\"Pclass\"，\"Name\"和\"Sex\"的数据显示出来"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {},
   "outputs": [],
   "source": [
    "#写入代码\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "【思考】对比`iloc`和`loc`的异同"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.12"
  },
  "toc": {
   "base_numbering": 1,
   "nav_menu": {},
   "number_sections": false,
   "sideBar": true,
   "skip_h1_title": false,
   "title_cell": "Table of Contents",
   "title_sidebar": "Contents",
   "toc_cell": false,
   "toc_position": {
    "height": "calc(100% - 180px)",
    "left": "10px",
    "top": "150px",
    "width": "582px"
   },
   "toc_section_display": true,
   "toc_window_display": true
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
